Smart Paper Notes

Human Pose Estimation from Sparse Inertial Measurements through Recurrent Graph Convolution

Patrik Puchert , Timo Ropinski

Categories: Computer Vision | Machine Learning

2021-07-23

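Conventional methods for human pose estimation either require a high degree of instrumentation, by relying on many inertial measurement units (IMUs), or constrain the recording space, by relying on external cameras. These deficits are addressed by estimating human pose from sparse IMU data. We define adjacency adaptive graph convolutional long short-term memory networks (AAGC-LSTM) to estimate human pose from six IMUs while incorporating the graph structure of the human body directly into the network. The AAGC-LSTM combines spatial and temporal dependencies in a single network operation and is more memory-efficient than previous approaches. This is made possible by equipping graph convolutions with adjacency adaptivity, which eliminates the problem of information loss in deep or recurrent graph networks while also allowing unknown dependencies between human body joints to be learned. To further improve accuracy, we propose longitudinal loss weighting to account for natural movement patterns. With the proposed approach, we are able to exploit the inherent graph nature of the human body and thus outperform the state of the art (SOTA) in human pose estimation from sparse IMU data.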
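
The idea of an adjacency-adaptive graph convolution can be illustrated with a minimal sketch. It assumes the simplest possible form, in which a fixed skeleton adjacency is summed with a learned offset matrix before the usual graph-convolution product; the function names, the toy three-joint skeleton, and the numeric values are hypothetical and are not taken from the paper.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def adaptive_graph_conv(features, fixed_adj, learned_adj, weight):
    """One graph-convolution step: (A_fixed + A_learned) @ X @ W."""
    adj = [[f + g for f, g in zip(f_row, g_row)]
           for f_row, g_row in zip(fixed_adj, learned_adj)]
    return matmul(matmul(adj, features), weight)


# Toy skeleton with three joints in a chain 0 - 1 - 2 (self-loops included).
fixed_adj = [[1.0, 1.0, 0.0],
             [1.0, 1.0, 1.0],
             [0.0, 1.0, 1.0]]
# Hypothetical learned offsets: they add a link between joints 0 and 2,
# which the fixed skeleton does not contain.
learned_adj = [[0.0, 0.0, 0.5],
               [0.0, 0.0, 0.0],
               [0.5, 0.0, 0.0]]
features = [[1.0], [2.0], [3.0]]  # one feature per joint
weight = [[2.0]]                  # 1x1 weight matrix

output = adaptive_graph_conv(features, fixed_adj, learned_adj, weight)
print(output)
```

In this toy case, joint 0 receives information from joint 2 only through the learned offset, which is the kind of dependency between non-adjacent joints the abstract refers to.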

translated by Google Translate

Robust Anomaly Map Assisted Multiple Defect Detection with Supervised Classification Techniques

Jože M. Rožanec , Patrik Zajec , Spyros Theodoropoulos , Erik Koehorst , Blaž Fortuna , Dunja Mladenić

Categories: Computer Vision | Machine Learning

2022-12-19

Industry 4.0 aims to optimize the manufacturing environment by leveraging new technological advances, such as new sensing capabilities and artificial intelligence. The DRAEM technique has shown state-of-the-art performance for unsupervised classification. The ability to create anomaly maps highlighting areas where defects probably lie can be leveraged to provide cues to supervised classification models and enhance their performance. Our research shows that the best performance is achieved when training a defect detection model by providing an image and the corresponding anomaly map as input. Furthermore, such a setting provides consistent performance when framing the defect detection as a binary or multiclass classification problem and is not affected by class balancing policies. We performed the experiments on three datasets with real-world data provided by Philips Consumer Lifestyle BV.
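
Feeding an image together with its anomaly map to a classifier can be sketched as stacking the two along a channel axis. The snippet below is a minimal illustration under that assumption; the function name and the toy 2x2 data are hypothetical, and the abstract does not state how the two inputs are actually combined.

```python
def stack_channels(image, anomaly_map):
    """Pair each pixel with its anomaly score, giving an H x W x 2 input."""
    same_shape = len(image) == len(anomaly_map) and all(
        len(img_row) == len(map_row)
        for img_row, map_row in zip(image, anomaly_map)
    )
    if not same_shape:
        raise ValueError("image and anomaly map must have the same shape")
    return [
        [[pixel, score] for pixel, score in zip(img_row, map_row)]
        for img_row, map_row in zip(image, anomaly_map)
    ]


# Toy grayscale image and an anomaly map that flags the top-right pixel.
image = [[0.1, 0.2], [0.3, 0.4]]
anomaly_map = [[0.0, 0.9], [0.0, 0.0]]

model_input = stack_channels(image, anomaly_map)
print(model_input[0][1])  # pixel value and anomaly score for the flagged pixel
```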

Synthetic Data Augmentation Using GAN For Improved Automated Visual Inspection

Jože M. Rožanec , Patrik Zajec , Spyros Theodoropoulos , Erik Koehorst , Blaž Fortuna , Dunja Mladenić

Categories: Computer Vision | Artificial Intelligence

2022-12-19

Quality control is a crucial activity performed by manufacturing companies to ensure their products conform to the requirements and specifications. The introduction of artificial intelligence models makes it possible to automate visual quality inspection, speeding up the inspection process and ensuring all products are evaluated under the same criteria. In this research, we compare supervised and unsupervised defect detection techniques and explore data augmentation techniques to mitigate the data imbalance in the context of automated visual inspection. Furthermore, we use Generative Adversarial Networks for data augmentation to enhance the classifiers' discriminative performance. Our results show that state-of-the-art unsupervised defect detection does not match the performance of supervised models but can be used to reduce the labeling workload by more than 50%. Furthermore, the best classification performance was achieved with GAN-based data generation, with AUC ROC scores equal to or higher than 0.9898, even when increasing the dataset imbalance by leaving only 25% of the images denoting defective products. We performed the research with real-world data provided by Philips Consumer Lifestyle BV.
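
For readers unfamiliar with the AUC ROC metric quoted above, it can be computed as the fraction of (defective, non-defective) pairs in which the defective item receives the higher score, with ties counted as half. The sketch below uses that pairwise definition; the labels and scores are made-up toy values, not results from the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) definition."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy example: 1 marks a defective product, 0 a good one.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(auc)  # 0.75
```

The pairwise form is quadratic in the number of samples, which is fine for illustration; a sort-based rank computation would be preferable for large datasets.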

Reinforcement Learning in an Adaptable Chess Environment for Detecting Human-understandable Concepts

Patrik Hammersborg , Inga Strümke

Categories: Machine Learning | Artificial Intelligence

2022-11-10

Self-trained autonomous agents developed using machine learning are showing great promise in a variety of control settings, perhaps most remarkably in applications involving autonomous vehicles. The main challenge associated with self-learned agents in the form of deep neural networks is their black-box nature: it is impossible for humans to interpret deep neural networks. Therefore, humans cannot directly interpret the actions of agents based on deep neural networks, or foresee their robustness in different scenarios. In this work, we demonstrate a method for probing which concepts self-learning agents internalise in the course of their training. For demonstration, we use a chess-playing agent in a fast and light environment developed specifically to be suitable for research groups without access to enormous computational resources or machine learning models.

NVRadarNet: Real-Time Radar Obstacle and Free Space Detection for Autonomous Driving

Alexander Popov , Patrik Gebhardt , Ke Chen , Ryan Oldja , Heeseok Lee , Shane Murray , Ruchi Bhargava , Nikolai Smolyanskiy

Categories: Computer Vision | Machine Learning | Robotics

2022-09-29

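Detecting obstacles is crucial for safe and efficient autonomous driving. To this end, we present NVRadarNet, a deep neural network (DNN) that detects dynamic obstacles and drivable free space using automotive radar sensors. The network uses temporally accumulated data from multiple radar sensors to detect dynamic obstacles and compute their orientation in a top-down bird's-eye view (BEV). The network also regresses drivable free space to detect unclassified obstacles. Our DNN is the first of its kind to use sparse radar signals to perform obstacle and free space detection from radar data in real time. The network has been used successfully on our autonomous vehicles in real self-driving scenarios. It runs faster than real time on an embedded GPU and generalizes well across geographic regions.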

Active Learning and Approximate Model Calibration for Automated Visual Inspection in Manufacturing

Jože M. Rožanec , Luka Bizjak , Elena Trajkova , Patrik Zajec , Jelle Keizer , Blaž Fortuna , Dunja Mladenić

Categories: Machine Learning | Artificial Intelligence | Computer Vision

2022-09-12

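Quality control is a crucial activity performed by manufacturing enterprises to ensure that their products meet quality standards and to avoid potential damage to the brand's reputation. The falling cost of sensors and connectivity has enabled increasing digitalization of manufacturing. In addition, artificial intelligence enables higher degrees of automation, reducing the overall cost and time required for defect inspection. This research compares three active learning approaches (with single and multiple oracles) applied to visual inspection. We propose a novel approach to probability calibration of classification models and two new metrics to assess the performance of the calibration without the need for ground truth. We performed experiments on real-world data provided by Philips Consumer Lifestyle BV. Our results show that, considering a threshold of p = 0.95, the explored active learning settings can reduce the data labeling effort by 3% to 4% without detriment to the overall quality goals. Furthermore, we show that the proposed metrics successfully capture relevant information otherwise available only through metrics that require ground truth data. Therefore, the proposed metrics can be used to estimate the quality of a model's probability calibration without committing to a labeling effort to obtain ground truth data.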
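
One common way a probability threshold such as p = 0.95 is used in active learning is to accept model predictions whose confidence reaches the threshold and to send the rest to a human oracle. The sketch below assumes that reading; the abstract does not spell out how the threshold is applied, and the function name and toy probabilities are hypothetical.

```python
def split_by_confidence(probabilities, threshold=0.95):
    """Separate confident predictions from samples that need manual review.

    probabilities: one list of class probabilities per sample.
    Returns (auto_labeled, to_review), where auto_labeled holds
    (sample_index, predicted_class) pairs and to_review holds sample indices.
    """
    auto_labeled, to_review = [], []
    for index, probs in enumerate(probabilities):
        confidence = max(probs)
        if confidence >= threshold:
            auto_labeled.append((index, probs.index(confidence)))
        else:
            to_review.append(index)
    return auto_labeled, to_review


# Toy calibrated outputs for three inspected items (class 0 = good, 1 = defective).
probabilities = [[0.98, 0.02], [0.60, 0.40], [0.03, 0.97]]
auto_labeled, to_review = split_by_confidence(probabilities)
print(auto_labeled, to_review)
```

A rule like this only makes sense when the probabilities are well calibrated, which is why calibration quality matters in such a setting.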

Neural apparent BRDF fields for multiview photometric stereo

Meghna Asthana , William A. P. Smith , Patrik Huber

Categories: Computer Vision

2022-07-14

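We propose to tackle the multiview photometric stereo problem using an extension of Neural Radiance Fields (NeRFs) conditioned on light source direction. The geometric part of our neural representation predicts the surface normal direction, allowing us to reason about local surface reflectance. The appearance part of our neural representation is decomposed into a neural bidirectional reflectance distribution function (BRDF), learned as part of the fitting process, and a shadow prediction network (conditioned on light source direction), which together allow us to model the apparent BRDF. This balance of learned components with inductive biases based on physical image formation models allows us to extrapolate far from the light source and viewer directions observed during training. We demonstrate our approach on a multiview photometric stereo benchmark and show that competitive performance can be obtained with the neural density representation of a NeRF.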

Teachers in concordance for pseudo-labeling of 3D sequential data

Awet Haileslassie Gebrehiwot , Patrik Vacek , David Hurych , Karel Zimmermann , Patrick Perez , Tomáš Svoboda

Categories: Computer Vision | Robotics

2022-07-13

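Automatic pseudo-labeling is a powerful tool for exploiting large amounts of sequential unlabeled data. It is especially appealing in safety-critical applications of autonomous driving, where performance requirements are extreme, datasets are large, and manual labeling is very challenging. We propose to leverage the sequential nature of the captures to improve pseudo-labeling in a teacher-student setup by training multiple teachers, each with access to different temporal information. This set of teachers, dubbed Concordance, provides higher-quality pseudo-labels for student training than standard methods. The outputs of the multiple teachers are combined via a novel pseudo-label confidence-guided criterion. Our experimental evaluation focuses on the 3D point cloud domain in urban driving scenarios. We show the performance of our method applied to multiple model architectures on the tasks of 3D semantic segmentation and 3D object detection on two benchmark datasets. Our approach, which uses only 20% of the manual labels, outperforms some fully supervised methods. A notable performance boost is achieved for classes that rarely appear in the training data, such as bicycles and pedestrians. The implementation of our approach is publicly available at https://github.com/ctu-vras/t-concord3d.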
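
Combining several teachers by confidence can be illustrated with a deliberately simple rule: for each point, take the label of the most confident teacher and discard it if even that confidence is below a cutoff. This is only a stand-in for illustration. The paper's actual confidence-guided criterion is not described in the abstract, and the function name, the 0.8 cutoff, and the toy predictions are all hypothetical.

```python
def combine_teachers(teacher_outputs, min_confidence=0.8):
    """Merge per-point predictions from several teachers.

    teacher_outputs: one list per teacher, each holding a
    (label, confidence) pair for every point.
    Returns one pseudo-label per point, or None where no teacher
    is confident enough.
    """
    combined = []
    for per_point in zip(*teacher_outputs):
        label, confidence = max(per_point, key=lambda pair: pair[1])
        combined.append(label if confidence >= min_confidence else None)
    return combined


# Two hypothetical teachers labelling the same three points.
teacher_a = [("car", 0.90), ("cyclist", 0.55), ("road", 0.70)]
teacher_b = [("car", 0.85), ("pedestrian", 0.60), ("road", 0.95)]

pseudo_labels = combine_teachers([teacher_a, teacher_b])
print(pseudo_labels)  # ['car', None, 'road']
```

Points left as None would simply be excluded from the student's training loss in a setup like this.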

Gated Domain Units for Multi-source Domain Generalization

Simon Föll , Alina Dubatovka , Eugen Ernst , Martin Maritsch , Patrik Okanovic , Gudrun Thäter , Joachim M. Buhmann , Felix Wortmann , Krikamol Muandet

Categories: Machine Learning

2022-06-24

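Distribution shift (DS) is a common problem that deteriorates the performance of learning machines. To overcome this problem, we postulate that real-world distributions are composed of elementary distributions that remain invariant across different domains. We call this the invariant elementary distribution (I.E.D.) assumption. This invariance thus enables knowledge transfer to unseen domains. To exploit this assumption in domain generalization (DG), we developed a modular neural network layer consisting of Gated Domain Units (GDUs). Each GDU learns an embedding of an individual elementary domain, which allows us to encode domain similarities during training. During inference, the GDUs compute similarities between an observation and each of the corresponding elementary distributions, which are then used to form a weighted ensemble of learning machines. Because our layer is trained with backpropagation, it can be easily integrated into existing deep learning frameworks. Our evaluation on Digits5, ECG, Camelyon17, iWildCam, and FMoW shows a significant improvement in performance on out-of-training target domains without any access to data from those target domains. This finding supports the validity of the I.E.D. assumption in real-world data distributions.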
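
The inference step described above, similarity to each elementary distribution followed by a weighted ensemble, can be sketched as follows. The sketch assumes fixed basis vectors, a Gaussian (RBF) similarity, and scalar learner outputs; in the paper the embeddings are learned and the similarity measure may differ, so every name and value here is hypothetical.

```python
import math


def rbf_similarity(x, basis, gamma=1.0):
    """Gaussian similarity between an observation and a basis vector."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, basis))
    return math.exp(-gamma * squared_distance)


def gated_ensemble(x, bases, learner_outputs):
    """Weight each learner's output by the similarity of x to its basis."""
    similarities = [rbf_similarity(x, basis) for basis in bases]
    total = sum(similarities)
    weights = [s / total for s in similarities]
    prediction = sum(w * o for w, o in zip(weights, learner_outputs))
    return weights, prediction


# Two hypothetical elementary domains and the output of one learner for each.
bases = [[0.0, 0.0], [3.0, 4.0]]
learner_outputs = [1.0, 0.0]

# An observation sitting on the first basis is dominated by the first learner.
weights, prediction = gated_ensemble([0.0, 0.0], bases, learner_outputs)
print(weights, prediction)
```

Normalizing the similarities so that they sum to one keeps the ensemble output on the same scale as the individual learners.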

Transformers Improve Breast Cancer Diagnosis from Unregistered Multi-View Mammograms

Xuxin Chen , Ke Zhang , Neman Abdoli , Patrik W. Gilley , Ximin Wang , Hong Liu , Bin Zheng , Yuchen Qiu

Categories: Computer Vision | Artificial Intelligence

2022-06-21

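Deep convolutional neural networks (CNNs) have been widely used in various medical imaging tasks. However, due to the intrinsic locality of convolution operations, CNNs generally cannot model long-range dependencies well, and these are important for accurately identifying or mapping corresponding breast lesion features computed from multiple unregistered mammograms. This motivates us to leverage the architecture of multi-view vision transformers to capture long-range relationships among multiple mammograms acquired from the same patient in one examination. For this purpose, we employ local transformer blocks to separately learn patch relationships within each of the four mammograms acquired from two views (CC/MLO) of the two sides (right/left) of the breasts. The outputs from different views and sides are concatenated and fed into global transformer blocks to jointly learn patch relationships among the four images representing two different views of the left and right breasts. To evaluate the proposed model, we retrospectively assembled a dataset of 949 sets of mammograms, which includes 470 malignant cases and 479 normal or benign cases. We trained and evaluated the model using five-fold cross-validation. Without any arduous preprocessing steps (e.g., optimal window cropping, chest wall or pectoral muscle removal, two-view image registration), our four-image (two-view, two-side) transformer-based model achieves a case classification performance with an area under the ROC curve of AUC = 0.818, which significantly outperforms the AUC = 0.784 achieved by the state-of-the-art multi-view CNN (p = 0.009). It also outperforms two single-view models, which achieve 0.724 (CC view) and 0.769 (MLO view), respectively. The study demonstrates the potential of using transformers to develop high-performing computer-aided diagnosis schemes that combine four mammograms.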
